External Memory Algorithms for String Problems

Authors

  • Kangho Roh
  • Maxime Crochemore
  • Costas S. Iliopoulos
  • Kunsoo Park
Abstract

In this paper we present external memory algorithms for some string problems. External memory algorithms have been developed in many research areas, as the speed gap between fast internal memory and slow external memory continues to grow. The goal of external memory algorithms is to minimize the number of input/output operations between internal memory and external memory. In recent years, the sizes of strings such as DNA sequences have been increasing rapidly. However, external memory algorithms have been developed for only a few string problems. In this paper we consider five string problems and present external memory algorithms for them: finding the maximum suffix, string matching, period finding, Lyndon decomposition, and finding the minimum of a circular string. Every algorithm that we present here runs in a linear number of I/Os in the external memory model with one disk, and in an optimal number of disk I/Os in the external memory model with multiple disks.
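For readers unfamiliar with one of the five problems, Lyndon decomposition, the following is a minimal in-memory sketch using Duval's classical algorithm. It is not the external memory algorithm of the paper, whose details the abstract does not give; the function name `lyndon_factorization` is a hypothetical choice made for illustration. The sketch only shows what the problem asks for: splitting a string into a non-increasing sequence of Lyndon words.

```python
def lyndon_factorization(s: str) -> list[str]:
    """Split s into its Lyndon factors with Duval's algorithm.

    Illustrative in-memory version only (assumption: the whole string
    fits in internal memory); it is not the paper's external memory method.
    """
    factors = []
    n = len(s)
    i = 0  # start of the part of s not yet factored
    while i < n:
        j = i + 1  # next character to examine
        k = i      # character that s[j] is compared against
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # the current candidate is still one Lyndon word
            else:
                k += 1     # the candidate keeps repeating with period j - k
            j += 1
        # Emit every complete repetition of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


print(lyndon_factorization("banana"))
```

Each factor is a Lyndon word (strictly smaller than all of its proper rotations), and the factors appear in non-increasing lexicographic order.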

Similar resources

Faster External Memory LCP Array Construction

The suffix array, perhaps the most important data structure in modern string processing, needs to be augmented with the longest-common-prefix (LCP) array in many applications. Their construction is often a major bottleneck especially when the data is too big for internal memory. We describe two new algorithms for computing the LCP array from the suffix array in external memory. Experiments demo...

Obtaining Provably Good Performance from Suffix Trees in Secondary Storage

Designing external memory data structures for string databases is of significant recent interest due to the proliferation of biological sequence data. The suffix tree is an important indexing structure that provides optimal algorithms for memory bound data. However, string B-trees provide the best known asymptotic performance in external memory for substring search and update operations. Work on...

An Integrated, Fast and Scalable Approach for Biological Network Analysis

Analysis of biological networks has become a major challenge due to the recent development of high-throughput techniques which are rapidly producing very large datasets. A number of algorithms, techniques and applications have been proposed to infer new information from various biological networks. In practice, most of the available algorithms and related tools for exploring different properties...

Electromagnetism-like Algorithms for The Fuzzy Fixed Charge Transportation Problem

In this paper, we consider the fuzzy fixed-charge transportation problem (FFCTP). Both the fixed and transportation costs are fuzzy numbers. In contrast to previous work, Electromagnetism-like Algorithms (EM) are proposed for the first time in this research area to solve the problem. Three types of EM (original EM, revised EM, and hybrid EM) are employed for the given problem. The latter is being firstl...

Engineering External Memory LCP Array Construction: Parallel, In-Place and Large Alphabet

The suffix array augmented with the LCP array is perhaps the most important data structure in modern string processing. There has been a lot of recent research activity on constructing these arrays in external memory. In this paper, we engineer the two fastest LCP array construction algorithms (ESA 2016) and improve them in three ways. First, we speed up the algorithms by up to a factor of two ...

Journal:
  • Fundam. Inform.

Volume 84  Issue

Pages  -

Publication date: 2008